Abstract: THIS PAPER IS ELIGIBLE FOR THE STUDENT PAPER AWARD. The closure of the set of entropy functions associated with $n$ discrete variables, $\bar{\Gamma}^*_n$, is a convex cone in $(2^n - 1)$-dimensional space, but its full characterization remains an open problem. In this paper, we map $\bar{\Gamma}^*_n$ to an $n$-dimensional region $\bar{\Phi}^*_n$ by averaging the joint entropies with the same number of variables, and show that the simpler $\bar{\Phi}^*_n$ can be characterized solely by the Shannon-type information inequalities.

I. Introduction 

Given an $n$-dimensional discrete random vector $X = (X_1, \ldots, X_n)$, for each non-empty subset $\alpha$ of $\mathcal{N} = \{1, 2, \ldots, n\}$ there is a joint entropy $H(X_\alpha)$ with $X_\alpha = (X_i)_{i \in \alpha}$, and the $2^n - 1$ joint entropies form the entropy function $(H(X_\alpha))_{\alpha \subseteq \mathcal{N}, \alpha \neq \emptyset}$ of $X$. We can then define $\Gamma^*_n \subset \mathbb{R}^{2^n - 1}$ as the set of all possible entropy functions involving $n$ discrete random variables, and $\bar{\Gamma}^*_n$ as its closure. A vector $H \in \mathbb{R}^{2^n - 1}$ is called entropic if $H \in \Gamma^*_n$, and almost entropic if $H \in \bar{\Gamma}^*_n$ [1].

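All $H = (H_\alpha)_{\alpha \subseteq \mathcal{N}, \alpha \neq \emptyset} \in \bar{\Gamma}^*_n$ satisfy the following Shannon-type information inequalities for any subsets $\alpha, \beta$ of $\mathcal{N}$ (we let $H_\emptyset = 0$ for convenience):

$$H_\alpha \ge 0, \tag{1}$$

$$H_\alpha \le H_\beta, \quad \alpha \subseteq \beta, \tag{2}$$

$$H_\alpha + H_\beta \ge H_{\alpha \cup \beta} + H_{\alpha \cap \beta}. \tag{3}$$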

However, (1)-(3) are not sufficient conditions for an $H \in \mathbb{R}^{2^n - 1}$ to be almost entropic when $n \ge 4$ [2]. In other words, denoting by $\Gamma_n$ the set of vectors in $\mathbb{R}^{2^n - 1}$ satisfying (1)-(3), we have

$$\bar{\Gamma}^*_n \subsetneq \Gamma_n, \quad n \ge 4. \tag{4}$$

A number of non-Shannon-type information inequalities satisfied by the members of $\bar{\Gamma}^*_n$ have subsequently been found in [2]-[4], but the full characterization of $\bar{\Gamma}^*_n$ remains an open problem.

In this paper, we will show that an averaged version of $\bar{\Gamma}^*_n$ can be more easily characterized.

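Definition 1: For a vector $H = (H_\alpha)_{\alpha \subseteq \mathcal{N}, \alpha \neq \emptyset} \in \mathbb{R}^{2^n - 1}$, we define its average as

$$\Psi(H) = (h_1, \ldots, h_n), \tag{5}$$

where $h_k = \binom{n}{k}^{-1} \sum_{|\alpha| = k} H_\alpha$. If $H$ is the entropy function of random vector $X$, we call $h = \Psi(H)$ the average entropy function. $\Psi$ then maps $\Gamma^*_n$ to the set $\Phi^*_n = \Psi(\Gamma^*_n)$ of all average entropy functions, $\bar{\Gamma}^*_n$ to the closure $\bar{\Phi}^*_n$, and $\Gamma_n$ to $\Phi_n \triangleq \Psi(\Gamma_n)$.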
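Definition 1 can be illustrated numerically. The following is a minimal sketch, not part of the original paper: the function names `joint_entropy` and `average_entropy`, the dictionary representation of the joint distribution, and the XOR example are all assumptions chosen for illustration. Entropies are measured in bits.

```python
from itertools import combinations
from math import comb, log2

def joint_entropy(pmf, subset):
    """Entropy in bits of the marginal of `pmf` on the coordinates in `subset`."""
    marginal = {}
    for outcome, p in pmf.items():
        key = tuple(outcome[i] for i in subset)
        marginal[key] = marginal.get(key, 0.0) + p
    return -sum(p * log2(p) for p in marginal.values() if p > 0)

def average_entropy(pmf, n):
    """Return [h_1, ..., h_n], where h_k averages H(X_alpha) over all |alpha| = k."""
    return [
        sum(joint_entropy(pmf, s) for s in combinations(range(n), k)) / comb(n, k)
        for k in range(1, n + 1)
    ]

# Hypothetical example: X1, X2 are independent fair bits and X3 = X1 XOR X2.
pmf = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
h = average_entropy(pmf, 3)
```

For this example every single variable carries one bit and every pair (as well as the full triple) carries two bits, so `h` comes out as approximately `[1, 2, 2]`.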

From the definition (1)-(3) of $\Gamma_n$, $\Phi_n$ can be given by

$$\Phi_n = \{(h_1, \ldots, h_n) \mid h_{k-1} - 2h_k + h_{k+1} \le 0,\ k = 1, \ldots, n\}, \tag{6}$$

where we let $h_0 = 0$ and $h_{n+1} = h_n$ for convenience. $\bar{\Phi}^*_n$ is obviously a subset of $\Phi_n$ since $\bar{\Gamma}^*_n \subseteq \Gamma_n$, but we will show that they are actually equal. In other words, $\bar{\Phi}^*_n$ is characterizable solely with the Shannon-type information inequalities.

Theorem 1: $\bar{\Phi}^*_n = \Phi_n$.

This theorem will be proved in the next section. 

II. Proof of the Theorem 

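It is only necessary to prove that $\Phi_n \subseteq \bar{\Phi}^*_n$. We first introduce a one-to-one transform to give $\Phi_n$ a simpler form.

Definition 2: For a vector $h = (h_1, \ldots, h_n) \in \mathbb{R}^n$, we define its second-order difference as

$$\Theta(h) = (g_1, \ldots, g_n), \tag{7}$$

where $g_k = h_{k-1} - 2h_k + h_{k+1}$, $k = 1, \ldots, n$, with $h_0 = 0$ and $h_{n+1} = h_n$. $\Theta$ maps $\Phi^*_n$ to $\Delta^*_n = \Theta(\Phi^*_n)$, $\bar{\Phi}^*_n$ to $\bar{\Delta}^*_n$, and $\Phi_n$ to $\Delta_n \triangleq \Theta(\Phi_n)$.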
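The second-order difference of Definition 2, together with the membership test for the Shannon region that it induces through (6), can be sketched as follows. The function names and the tolerance parameter `tol` are illustrative assumptions, not notation from the paper.

```python
def second_order_difference(h):
    """Return [g_1, ..., g_n] with g_k = h_{k-1} - 2*h_k + h_{k+1},
    using the conventions h_0 = 0 and h_{n+1} = h_n."""
    padded = [0.0] + list(h) + [h[-1]]
    return [padded[k - 1] - 2 * padded[k] + padded[k + 1]
            for k in range(1, len(h) + 1)]

def satisfies_shannon_average(h, tol=1e-12):
    """True if every second-order difference is non-positive, i.e. h meets (6)."""
    return all(g <= tol for g in second_order_difference(h))

g = second_order_difference([1.0, 2.0, 2.0])
ok = satisfies_shannon_average([1.0, 2.0, 2.0])
bad = satisfies_shannon_average([1.0, 3.0, 3.0])
```

The vector `(1, 2, 2)` has second-order difference `(0, -1, 0)` and passes the test, while `(1, 3, 3)` fails because its first component gives `0 - 2 + 3 = 1 > 0`, which would mean two variables carry more entropy together than the sum of their average individual entropies.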

From (6), we have

$$\Delta_n = \{(g_1, \ldots, g_n) \mid g_k \le 0,\ k = 1, \ldots, n\}. \tag{8}$$



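As $\Psi$ and $\Theta$ are both linear maps, and $\bar{\Gamma}^*_n$ is a convex cone [5], $\bar{\Phi}^*_n$ and $\bar{\Delta}^*_n$ are both convex cones as well. Therefore, to prove that $\Phi_n \subseteq \bar{\Phi}^*_n$ or equivalently $\Delta_n \subseteq \bar{\Delta}^*_n$, it is sufficient to prove that

$$g^k \triangleq (0, \ldots, 0, -a, 0, \ldots, 0) \in \bar{\Delta}^*_n, \tag{9}$$

with $-a$ in the $k$-th position, for $k = 1, \ldots, n$ and some $a > 0$. In other words, for each $k$ we need to find a random vector $X$ whose average entropy function is

$$h^k = \Theta^{-1}(g^k) = a \cdot (1, 2, \ldots, k-1, k, k, \ldots, k). \tag{10}$$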
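As a check on (10), one can apply Definition 2 directly to the vector $a \cdot (1, 2, \ldots, k-1, k, k, \ldots, k)$, that is, $h_j = a \min(j, k)$, with the conventions $h_0 = 0$ and $h_{n+1} = h_n$. The short derivation below is an added verification, not part of the original text; it covers the three cases for the index $j$.

```latex
\begin{align*}
g_j &= a\,[(j-1) - 2j + (j+1)] = 0,  & j &< k, \\
g_k &= a\,[(k-1) - 2k + k] = -a,     & j &= k, \\
g_j &= a\,[k - 2k + k] = 0,          & j &> k.
\end{align*}
```

The case $j = k = n$ uses $h_{n+1} = h_n = ak$, so the second line holds there as well. The only non-zero entry is therefore $-a$ in position $k$, which is the extreme ray of the cone in (8) along the $k$-th coordinate.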



This $X$ can be constructed from a Reed-Solomon code. Specifically, let $q$ be a power of two larger than $n$, $C$ be the codeword set of an $(n, k)$ Reed-Solomon code on $GF(q)$, and $X = (X_1, \ldots, X_n)$ be a random codeword uniformly distributed over $C$. Then the average entropy function of $X$ is (10) with $a = \log q$, as shown below.

Let $j_1, \ldots, j_n$ be distinct indices in $1, \ldots, n$. According to the properties of Reed-Solomon codes, given any $x^*_{j_1}, \ldots, x^*_{j_k} \in GF(q)$, there exists a unique $x = (x_1, \ldots, x_n) \in C$ with $x_{j_l} = x^*_{j_l}$, $l = 1, \ldots, k$. For any $x^*_{j_1} \in GF(q)$, there are thus $q^{k-1}$ codewords $x \in C$ with $x_{j_1} = x^*_{j_1}$, one for each value combination on $k - 1$ other positions, and since $X$ is equal to each codeword with probability $q^{-k}$, we have $p(X_{j_1} = x^*_{j_1}) = q^{-1}$, so $H(X_{j_1}) = \log q$. Similarly, $H(X_{j_1}, X_{j_2}) = 2 \log q, \ldots, H(X_{j_1}, \ldots, X_{j_k}) = k \log q$. For $l = k + 1, \ldots, n$, given $x_{j_1}, \ldots, x_{j_l}$, there is either one matching codeword in $C$ or none, therefore $p(X_{j_1} = x_{j_1}, \ldots, X_{j_l} = x_{j_l})$ is $q^{-k}$ on its support, and $H(X_{j_1}, \ldots, X_{j_l}) = k \log q$. Consequently, the average entropy function of $X$ is (10) with $a = \log q$ as desired, and for each $l$, all $\binom{n}{l}$ $l$-variable joint entropies of $X$ that are being averaged actually have the same value. ■
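The counting argument in the proof can be checked by brute force on a small code. The sketch below makes one substitution that should be stated plainly: it uses a prime field $GF(p)$ with arithmetic modulo $p$ instead of the power-of-two field in the text, since the argument relies only on the code being maximum distance separable and modular arithmetic keeps the example short. The function names, the evaluation points $0, \ldots, n-1$, and the parameters $p = 5$, $n = 4$, $k = 2$ are illustrative assumptions. Entropies are taken in base $p$, so that $a = \log q$ corresponds to 1.

```python
from itertools import combinations, product
from math import log

def rs_codewords(p, n, k):
    """All codewords of an (n, k) Reed-Solomon code over GF(p), p prime, n <= p.
    Each message (m_0, ..., m_{k-1}) is a polynomial evaluated at 0, ..., n-1."""
    return [
        tuple(sum(m * pow(x, i, p) for i, m in enumerate(msg)) % p
              for x in range(n))
        for msg in product(range(p), repeat=k)
    ]

def subset_entropy(codewords, subset, base):
    """Joint entropy (log base `base`) of the coordinates in `subset`
    when the codeword is drawn uniformly from `codewords`."""
    counts = {}
    for c in codewords:
        key = tuple(c[i] for i in subset)
        counts[key] = counts.get(key, 0) + 1
    total = len(codewords)
    return -sum((v / total) * log(v / total, base) for v in counts.values())

p, n, k = 5, 4, 2  # assumed small parameters for illustration
code = rs_codewords(p, n, k)
entropies = {
    l: [subset_entropy(code, s, p) for s in combinations(range(n), l)]
    for l in range(1, n + 1)
}
```

For these parameters every entry of `entropies[l]` should equal `min(l, k)` up to floating-point error, matching both the shape of (10) and the remark that all $l$-variable joint entropies coincide before averaging.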
III. Discussion

Determination of $\bar{\Gamma}^*_n$ is important due to its close connection to the capacity region of general multi-source multi-sink wired networks [6], [7], but this seems to be a difficult problem, and even if a full characterization is found, computational difficulties due to $\bar{\Gamma}^*_n$'s high dimensionality and complex structure might reduce its usefulness in practice [8]. What we have shown is that the region $\bar{\Phi}^*_n$ obtained by averaging the $k$-variable joint entropies has a much simpler structure: it is not affected by the non-Shannon information inequalities, and the linear Reed-Solomon codes used in the proof suggest that the suboptimality of linear network coding is also hidden by this averaging. On one hand, this means that further work on the characterization of $\bar{\Gamma}^*_n$ must focus on the variation among the $k$-variable entropies, not just their averages. On the other hand, many practically interesting networks have a somewhat symmetric structure, possibly in a statistical sense, and an appropriately averaged version of $\bar{\Gamma}^*_n$ (not necessarily as simplistic as $\bar{\Phi}^*_n$) might provide a tractable method for the determination of their capacity regions.

Average entropy functions are also closely related to the 
MAP EXIT functions discussed in e.g. [9] for large n. 